Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add option to apply default values from the schema #349

Merged
merged 4 commits into from
Feb 22, 2017

Conversation

erayd
Copy link
Contributor

@erayd erayd commented Jan 17, 2017

For cases when the object being validated does not define a property, but that property has a default value in the schema, set the property value to the default. This PR has been rebased on PR #351.

Copy link
Collaborator

@shmax shmax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall I think this is a good addition, but you'll need to add some unit tests, update the readme.md, and think through a few more use cases.

@@ -45,10 +45,10 @@ public function check($value, $schema = null, JsonPointer $path = null, $i = nul
*
* {@inheritDoc}
*/
public function coerce(&$value, $schema = null, JsonPointer $path = null, $i = null)
public function coerce(&$value, $schema = null, JsonPointer $path = null, $i = null, $default = false)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than trail another argument on the end of check and coerce, you might consider adding a config option to drive this. Something like:

abstract class Constraint implements ConstraintInterface
{
//...
    const CHECK_MODE_NORMAL = 0x00000001;
    const CHECK_MODE_TYPE_CAST = 0x00000002;
    const CHECK_MODE_USE_DEFAULTS = 0x00000004;

That way you would only really have to change the code in one place and not have to chase down all the different paths like you're currently doing (and wonder if you missed any).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK - will change to use config, add unit tests, and update the readme :-).

Re additional use-cases, was there something specific you are referring to that I've missed? My understanding from reading the spec is that the default can be applied directly, and should be schema-compliant, but isn't mandated to be so.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was going to wait until we had some unit tests to look at, but you'll want to make sure that setting a default on items works, for example.

* @param string $property Property to set
* @param mixed $value Value to set
*/
protected function &setProperty(&$element, $property, $value)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it necessary to return a reference? Why is it necessary to return anything at all? $value is passed in, so presumably whoever is calling this method already has what it needs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid an extra call to getProperty, and for consistency with that method. The reference is to the newly set property, not to $value (unless $element isn't an object or an array, in which case it behaves the same as $fallback from getProperty).

Would you rather I implemented it without the reference return? The way I've done it feels tidier, to me, but quite happy to change it if you don't like that approach.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess there's no law against it, but having a setter return something just feels like an unnecessary mixing of concerns to me. Throwing in the reference is just extra squirm factor. You're only calling this method in one place, and though it's late and my head is fuzzy I really can't see why it's necessary or how its inclusion makes the code any simpler.

if ($default && isset($definition->default) && $property === $undefinedConstraint) {
    $property = $definition->default;
    $this->setProperty($element, $i, $property);
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also--and I may be able to come up with something more helpful tomorrow--I remember writing a block of similar object-vs-array set code when I was doing type coercion, but I eventually managed to find a point in the code where the assignment could be done a little more naturally in the inner workings, and it wasn't necessary to shoehorn it in like this. I'm off to bed, but I'll see if I can remember what I'm talking about tomorrow.

@erayd
Copy link
Contributor Author

erayd commented Jan 17, 2017

OK, how's that?

What are your thoughts on adding helper functions in the Validator class (e.g. checkDefault and coerceDefault)? Currently, the only way that I can see to run this with the config option is to pass a Factory object to the Validator constructor.

Haven't done the unit tests yet - those are next on my list.

@shmax
Copy link
Collaborator

shmax commented Jan 17, 2017

Currently, the only way that I can see to run this with the config option is to pass a Factory object to the Validator constructor

That is a bit awkward. I have a PR where I move check mode out of Factory to make this kind of thing easier to do, but I abandoned it when my coercion code evolved into not needing a flag. If you're interested we can revive it...

#320

@shmax
Copy link
Collaborator

shmax commented Jan 17, 2017

Try this schema and value:

$value = json_decode('[ 1 ]');
$schema = json_decode('{
    "type": "array",
    "items": [
        { "type": "number" },
	{ "type": "string", "default": "foo" }
    ]
}');

@shmax
Copy link
Collaborator

shmax commented Jan 17, 2017

When I use this schema and value...

$value = json_decode('{ "foo": 1 }');
$schema = json_decode('{
    "type": "object",
    "properties": {
        "foo": { "type": "number" },
	"bar": { "type": "string", "default": "baz" }
    },
    "required": [ "foo", "bar" ]
}');

The default value is applied, but validation fails ("The property bar is required")

@shmax
Copy link
Collaborator

shmax commented Jan 17, 2017

OK, how's that?

I don't know. I'm looking at the code more closely, now, and I see that the reference was needed because it gets forwarded to checkUndefined for further validation, so I guess my original review was a little shaky. Sorry about that. But I'm wondering if you're filling-in your defaults too early in the process. I have to head off to work, but I'll poke around some more after I get back...

@erayd
Copy link
Contributor Author

erayd commented Jan 17, 2017

Looks like there's some kind of issue with documents that are passed as an array too - discovered that when I was writing unit tests last night. I'm trying to track down what's going on there at the moment.

@erayd
Copy link
Contributor Author

erayd commented Jan 18, 2017

Have updated the PR again. Currently working on adding more tests, finding the root cause of an issue that fails a couple of the tests in assoc mode, and making default options for array items work.

@shmax
Copy link
Collaborator

shmax commented Jan 18, 2017

Sounds good. I'll let you keep plugging away. Let me know when it's really ready for another look.

@erayd
Copy link
Contributor Author

erayd commented Jan 18, 2017

Will do. I'll keep the PR updated so you can see where I'm up to, and will let you know when I'm ready for you to take another look :-).

@erayd erayd force-pushed the apply-defaults branch 3 times, most recently from 9bbf292 to 14cadd2 Compare January 18, 2017 10:12
@@ -66,7 +66,8 @@ public function validateTypes(&$value, $schema = null, JsonPointer $path, $i = n
}

// check object
if ($this->getTypeCheck()->isObject($value)) {
if (TypeCheck\LooseTypeCheck::isObject($value)) { // Fixes failing assoc tests for default values - currently investigating
//if ($this->getTypeCheck()->isObject($value)) { // to find the root cause of this, noting all other assoc tests pass.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strict type-checking here causes two of my tests to fail, but only in assoc mode (both tests relate to setting default values on properties in a child object). It's not immediately obvious what the purpose of a strict check here is - are you able to shed some light on why this is necessary, and why it's not OK to always run $this->checkObject() (via $this->validateTypes) on an associative array?

Changing to a loose check on this line (i.e. allowing associative arrays to be considered objects too) resolves the issue, and allows all tests to succeed, but I don't want to do this if it's going to break something, either now or in the future. If you could provide some insight into the reason behind this restriction, it would be extremely helpful.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The loose checking stuff is from before my time, but I can dig deeper after work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks :-). I don't suppose anyone involved with the addition of that code is still around? Looks like it was done midway through last year.

Copy link
Collaborator

@shmax shmax Jan 19, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay.

I've stepped through your code, and the problem is that you are deferring to testValidCases from testValidCasesUsingAssoc, and you're passing along an array instead of an object. If you look in BaseTestCase::testValidCasesUsingAssoc, you'll see that it prepares Validator by passing in $checkMode, and the default value of $checkMode is CHECK_MODE_TYPE_CAST, which will have no problem with associative arrays.

I see that you extended VeryBaseTestCase instead of BaseTestCase; presumably you did that because you wanted a different signature for testValidCasesUsingAssoc and testValidCases. That's okay, but make sure that you flesh out testValidCasesUsingAssoc with the proper boilerplate.

Hope that helps!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks :-).

Unfortunately, turning on CHECK_MODE_TYPE_CAST results in pretty much all the assoc tests breaking, and doesn't fix the issue. Am I perhaps misunderstanding what you're suggesting here?

    /**
     * @dataProvider getValidTests
     */
    public function testValidCasesUsingAssoc($input, $schema, $expectOutput = null)
    {
        if (is_string($input)) {
            $inputDecoded = json_decode($input, true);
        } else {
            $inputDecoded = $input;
        }

        $validator = new Validator(new Factory(null, null, Constraint::CHECK_MODE_TYPE_CAST));
        $validator->coerceDefault($inputDecoded, json_decode($schema));

        $this->assertTrue($validator->isValid(), print_r($validator->getErrors(), true));

        if ($expectOutput !== null) {
            $this->assertEquals($expectOutput, json_encode($inputDecoded));
        }
    }

Using the above for assoc tests, it fails for datasets 1, 2, 3, 4, 5, 6, 7, 8 & 10, regardless of whether or not I'm using loose type checking in UndefinedConstraint::validateTypes(). Using what I originally had, 4 & 5 are the only tests that fail. Using what I originally had plus loose type checking, everything passes.

Copy link
Collaborator

@shmax shmax Jan 19, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup. Now you're hitting the next problem, which is that in LooseTypeCheck::propertySet you are doing a write to $value, which is passed by value, so the write doesn't stick. You need to either pass by reference, or pass in an object. It was working in your old code before because when you deferred to testValidCases, that was converting to object.

Hope that all makes sense. I'm off to bed, but I can help more tomorrow morning if you leave some comments. G'nite!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops - that is definitely a mistake on my part, sorry for wasting your time with that one! Not sure how I missed it, but thanks for finding it.

All sorted now, and no need for the loose type checking any more :-).

*
* {@inheritDoc}
*/
public function coerceDefault(&$value, $schema = null, JsonPointer $path = null, $i = null)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really not a fan of this kind of helper, for a couple of reasons:

  1. You're jamming together two concepts, coercion and default'ing. What happens if I want to coerce, but not default (call coerce, okay)? What happens if I want to default, but not coerce (well, we don't have that one yet). What's going to happen when a third thing comes along? Do we just keep fragmenting this into more and more helpers, one for each possible combination?
  2. It has the feel of a fire-and-forget method, but it secretly twiddles a flag and leaves it there.

Can we please just leave this as a config option, for now? I know it's awkward, but again, once we survive this PR I promise to resurrect my other PR which simplifies the API.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. It's a little awkward, but it sounds like you have a plan to resolve that in your other PR.

However... if it's a config option, then how would you like me to handle checking without coercion? The various check() methods currently don't require a reference, which means that defaulting breaks if $value is an array (because PHP doesn't pass arrays by reference unless the reference is explicit).

This is the reason I wrapped it in coerceDefault() in the first place - to prevent people trying to use it with check(), and then wondering why it didn't work. In an ideal world, I could simply alter the many check() methods to take a reference also... but is that something you're OK with?

Copy link
Collaborator

@shmax shmax Jan 19, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's a good point. Tell you what--as long as we're going to change the API, let's go all the way and just get rid of coerce, and convert that (back) to a config flag, as well. Then we'll modify check to just take a reference everywhere, and we'll be all set. I didn't do that on my second coercion attempt because I didn't want to break compatibility. In other words, this kind of thing would break:

$validator->check(['foo'=>'bar'], $schema);

But now that I think about it, that's a pretty outlandish use case; I mean, I literally can't think of any rational reason to be doing it, outside of writing sample code for readme files and whatnot.

What do you think? That will should simplify things for us both. If you're game you can make the change in this PR, or I can do it when I move $checkmode back up into Constraint (which you're also welcome to do if you're feeling ambitious).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like that idea a lot, and yes, quite happy to make the change. Do you want it in this PR, or in its own PR (and then I can rebase this after that one is merged)?

I don't see that use-case you mention above as a problem - anyone who needs to do something like that will just assign it to a variable and then check the variable - but it does raise the question of whether it's OK to break this functionality 'in the wild'.

I do think that Validator::coerce() should stay, if only as a flag-setting alias, to prevent breaking things for anyone who is currently using it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are your thoughts on PR #351?

In hindsight, I think the ability to check-by-value needs to stay, as people will be doing things like check(json_decode($someJSON), $schema), and it's important that we don't break that.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm really digging it! Will have more feedback after work.

@shmax
Copy link
Collaborator

shmax commented Jan 19, 2017

Okay, now I'm REALLY off to bed. Just one parting note: you don't have to keep squashing. When you do that I can't view the latest changes, and the narrative is lost. You can squash at the very end.

@erayd
Copy link
Contributor Author

erayd commented Jan 19, 2017

OK - goodnight, and sleep well :-).

Noted re the squashing, I'll leave the commit history there 'til the end.

@erayd
Copy link
Contributor Author

erayd commented Feb 2, 2017

Have rebased this on PR #351.

@erayd erayd force-pushed the apply-defaults branch 6 times, most recently from 02c52d9 to 941814b Compare February 2, 2017 13:50
@shmax
Copy link
Collaborator

shmax commented Feb 21, 2017

Constraint is the base class for most of the classes in the project. It is definitely an instance (unless something happened that I missed), and is where $checkMode used to live. Have a look here:

31b2954#diff-e7c19f3eac4c5373538af0e003cbb7c0L52

@digitalkaoz got a little carried away in this PR, and $checkMode wound up getting swept along in his changes to the other components ($schemaStorage and such), even though it didn't suffer from the same performance overhead as the other.

I had a PR to make this change at one point... let me see if I can dig it up... ah, here we go:
#320

As you can see I closed it myself once the coerce flag was removed, but now that it's back I figure we can take another look at the idea.

I'm perfectly willing to wait for 6.0.

@erayd
Copy link
Contributor Author

erayd commented Feb 21, 2017

@shmax Constraint is an abstract class, even in the diff you linked to. There is no possible way it could be instantiated.

What you're looking at is the default constructor, which if used to set $checkMode as per the red side of that commit (e.g. via calling parent::__construct() in a subclass), would result in an individual $checkMode for every instance of a Constraint subclass.

@shmax
Copy link
Collaborator

shmax commented Feb 21, 2017

@shmax Constraint is an abstract class

Yep, and abstract classes can have members, same as anything else. Just dig back in the history for Constraint and you'll see that it used to host all the stuff that is now in Factory.

@shmax
Copy link
Collaborator

shmax commented Feb 21, 2017

Here's a snapshot of the file from just before @digitalkaoz gutted it: https://github.com/justinrainbow/json-schema/blob/a918d3b5d9fb3eceb44826343bfe735bcb5c9292/src/JsonSchema/Constraints/Constraint.php

You can see that it has a variety of members, and is still abstract.

@shmax
Copy link
Collaborator

shmax commented Feb 21, 2017

Oh, you might be getting confused because $factory has moved from Constraint down to BaseConstraint, which might be where $checkMode would wind up.

@shmax
Copy link
Collaborator

shmax commented Feb 21, 2017

is an abstract class, even in the diff you linked to. There is no possible way it could be instantiated.

Sure there is! Simply subclass it, implement the interface, and instantiate.

@erayd
Copy link
Contributor Author

erayd commented Feb 21, 2017

@shmax Definitely not confused. I'm sorry, but you're really, really wrong on this one. See here:

Classes defined as abstract may not be instantiated...

Abstract classes may define members, which are inherited by their subclasses, but those members don't exist on the abstract class itself - they exist on subclass instances only.

If you instantiate a subclass, the inherited member has its own value for that subclass instance; it's not shared by all subclass instances.

@erayd
Copy link
Contributor Author

erayd commented Feb 21, 2017

Here's a quick example:

<?php

abstract class ParentClass {
    public $myField = 0;
}

class ChildClassOne extends ParentClass {
    public function __construct()
    {
        $this->myField = 1;
    }
}

class ChildClassTwo extends ParentClass {
    public function getMyField()
    {
        return $this->myField;
    }
}

$one = new ChildClassOne();
$two = new ChildClassTwo();

assert($two->getMyField() === 0); //true

@shmax
Copy link
Collaborator

shmax commented Feb 21, 2017

I never argued that there wouldn't be multiple copies of the value. There would be multiple copies, and in fact there are even now, as Factory feeds a copy of itself to whatever it instantiates. So even after my change there would be the same number of $checkModes floating around, they would just be moved up outside of Factory.

@erayd
Copy link
Contributor Author

erayd commented Feb 21, 2017

Gotcha - misunderstood your original argument. However... Factory does not feed a copy of itself to whatever it instantiates; it's an object - it passes $this by reference. Unless you're deliberately making more for some reason, there's only ever one instance of Factory.

It sounds like you are arguing in favour of going from one instance of $checkMode in a central location, to many instances passed all over the stack the way $coerce used to be? I don't see how that is in any way an improvement. If I've misunderstood, then can you correct me please?

Factory is never cloned anywhere:

steve@neith ~/dev/json-schema $ grep -r clone src
src/JsonSchema/Constraints/Factory.php:        return clone $this->instanceCache[$constraintName];
src/JsonSchema/Entity/JsonPointer.php:        $new = clone $this;

@shmax
Copy link
Collaborator

shmax commented Feb 21, 2017

Ah, no, not arguing in favor of passing it all over the stack. It would just go back to being a member on Constraint. But you have a good point about Factory and references. We could probably still simplify the interface (such that checkmode is passed in as a top level argument when creating Validator then passed to Factory), but we can revisit it later. As I said, let's worry about it in 6.0.

@erayd
Copy link
Contributor Author

erayd commented Feb 21, 2017

OK. I have no objection in principle to simplifying things.

Shall we leave this discussion for now, get this PR merged, and revisit making changes to $checkMode in a later PR?

@shmax
Copy link
Collaborator

shmax commented Feb 22, 2017

Sure, sounds good. By the way, did you ever try setting a default value for items?

Judging by the unit tests looks like you did. LGTM.

@erayd
Copy link
Contributor Author

erayd commented Feb 22, 2017

Yep - see array test cases in DefaultPropertiesTest::getValidTests().

@erayd
Copy link
Contributor Author

erayd commented Feb 22, 2017

Oh, hang on - do you mean defaulting all of items, rather than setting individual default items?

@shmax
Copy link
Collaborator

shmax commented Feb 22, 2017

Mm, the latter. Didn't consider the former. Not too worried about it, at least for now.

@erayd
Copy link
Contributor Author

erayd commented Feb 22, 2017

If I'm understanding your question correctly, the test case labelled 'default item value for an empty array' addresses setting all of items, and the one labelled 'default item value for an array' addresses setting individual items.

If there's a case I've missed, please let me know, and I'll add tests for it.

@shmax
Copy link
Collaborator

shmax commented Feb 22, 2017

Nope, looks great. :shipit:

@erayd
Copy link
Contributor Author

erayd commented Feb 22, 2017

Sweet 😀

@erayd
Copy link
Contributor Author

erayd commented Feb 22, 2017

Thanks for asking a question that made me look at the tests again though - spotted a couple of duplicates in there :-).

@bighappyface
Copy link
Collaborator

5.1.0 it is

@bighappyface bighappyface merged commit 48817e5 into jsonrainbow:master Feb 22, 2017
@bighappyface
Copy link
Collaborator

@erayd
Copy link
Contributor Author

erayd commented Feb 22, 2017

Yay! Thanks @bighappyface :-).

@erayd erayd deleted the apply-defaults branch February 22, 2017 04:00
@mathroc
Copy link
Contributor

mathroc commented Feb 22, 2017

@bighappyface @mathroc Not a fan of the idea; it complicates the API, and weakens it at the same time. The implication is that we don't trust ourselves to not make any unintended edits to the incoming data, but that's what we have unit tests for. If a consumer of this library REALLY wants to make sure that he doesn't lose his original input for some reason AND wants to coerce or apply defaults, he can certainly do so with no help from us

@shmax absolutely cloning the data before handing it to the validator is easy enough

thx all for this PR!

}
}
} elseif ($this->getTypeCheck()->isArray($value)) {
if (isset($schema->properties)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this could be refactored with https://github.com/justinrainbow/json-schema/pull/349/files#diff-9aeb900077027f2ca2e0a03af4ea6a2fR117 by using the LooseTypeCheck instead of $this->getTypeCheck()

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants